Abstract

We propose a parser for constraint-logic grammars implementing HPSG
that combines the advantages of dynamic bottom-up and advanced
top-down control. The parser allows the user to apply magic
compilation to specific constraints in a grammar, which as a result
can be processed dynamically in a bottom-up and goal-directed fashion.
State-of-the-art top-down processing techniques are used to deal with
the remaining constraints. We discuss various aspects concerning the
implementation of the parser as part of a grammar development system.

1 Introduction 

In case of large grammars the space requirements of dynamic parsing
often outweigh the benefit of not duplicating sub-computations. We
propose a parser that avoids this drawback by combining the
advantages of dynamic bottom-up and advanced top-down control. 1
The underlying idea is to achieve faster parsing by avoiding tabling
of sub-computations which are not expensive. The so-called selective
magic parser allows the user to apply magic compilation to specific
constraints in a grammar, which as a result can be processed
dynamically in a bottom-up and goal-directed fashion.
State-of-the-art top-down processing techniques are used to deal with
the remaining constraints.

Magic is a compilation technique originally developed for
goal-directed bottom-up processing of logic programs. See, among
others, (Ramakrishnan et al. 1992). As shown in (Minnen, 1996) magic
is an interesting technique with respect to natural language
processing as it incorporates filtering into the logic underlying the
grammar and enables elegant control-independent filtering
improvements. In this paper we investigate the selective application
of magic to typed feature grammars, a type of constraint-logic grammar
based on Typed Feature Logic (TFL; Götz, 1995). Typed feature
grammars can be used as the basis for implementations of Head-driven
Phrase Structure Grammar (HPSG; Pollard and Sag, 1994) as discussed in
(Götz and Meurers, 1997a) and (Meurers and Minnen, 1997). Typed
feature grammar constraints that are inexpensive to resolve are dealt
with using the top-down interpreter of the ConTroll grammar
development system (Götz and Meurers, 1997b), which uses an advanced
search function and an advanced selection function, and incorporates a
coroutining mechanism which supports delayed interpretation.

* The presented research was carried out at the University of
Tübingen, Germany, as part of the Sonderforschungsbereich 340.

1 A more detailed discussion of various aspects of the proposed
parser can be found in (Minnen, 1998).

The proposed parser is related to the so-called Lemma Table deduction
system (Johnson and Dörre, 1995) which allows the user to specify
whether top-down sub-computations are to be tabled. In contrast to
Johnson and Dörre's deduction system, though, the selective magic
parsing approach combines top-down and bottom-up control strategies.
As such it resembles the parser of the grammar development system
Attribute Logic Engine (ALE) of (Carpenter and Penn, 1994). Unlike the
ALE parser, though, the selective magic parser does not presuppose a
phrase structure backbone and is more flexible as to which
sub-computations are tabled/filtered.

2 Bottom-up Interpretation of 
Magic-compiled Typed Feature 
Grammars 

We describe typed feature grammars and discuss their use in
implementing HPSG grammars. Subsequently we present magic compilation
of typed feature grammars on the basis of an example and introduce a
dynamic bottom-up interpreter that can be used for goal-directed
interpretation of magic-compiled typed feature grammars.

2.1 Typed Feature Grammars 

A typed feature grammar consists of a signature and a set of definite
clauses over the constraint language of equations of TFL (Götz, 1995)
terms (Höhfeld and Smolka, 1988), which we will refer to as TFL
definite clauses. Equations over TFL terms can be solved using (graph)
unification provided they are in normal form. (Götz, 1994) describes a
normal form for TFL terms, where typed feature structures are
interpreted as satisfiable normal form TFL terms. 2 The signature
consists of a type hierarchy and a set of appropriateness conditions.

Example 1 The signature specified in figures 1 and 2 and the TFL
definite clauses in figure 3 constitute an example of a typed feature
grammar. We write TFL terms in normal form, i.e., as typed feature
structures. In addition, uninformative feature specifications are
ignored and typing is left implicit when immaterial to the example at
hand. Equations between typed feature structures are removed by simple
substitution or tags indicating structure sharing. Notice that we also
use non-numerical tags such as |Xs| and |XsYs|. In general all boxed
items indicate structure sharing. For expository reasons we represent
the ARGn features of the append relation as separate arguments.

Figure 2: Example of a typed feature grammar signature (part 2)

Typed feature grammars can be used as the basis for implementations of
Head-driven Phrase Structure Grammar (Pollard and Sag, 1994). 3
(Meurers and Minnen, 1997) propose a compilation of lexical rules into
TFL definite clauses which are used to restrict lexical entries. (Götz
and Meurers, 1997b) describe a method for compiling implicational
constraints into typed feature grammars and interleaving them with
relational constraints. 4 Because of space limitations we have to
refrain from an example. The ConTroll grammar development system as
described in (Götz and Meurers, 1997b) implements the above-mentioned
techniques for compiling an HPSG theory into typed feature grammars.

2 This view of typed feature structures differs from the perspective
on typed feature structures as modeling partial information as in
(Carpenter, 1992). Typed feature structures as normal form TFL terms
are merely syntactic objects.

3 See (King, 1994) for a discussion of the appropriateness of TFL for
HPSG and a comparison with other feature logic approaches designed for
HPSG.

2.2 Magic Compilation 

Magic is a compilation technique for goal-directed bottom-up
processing of logic programs. See, among others, (Ramakrishnan et al.
1992). Because magic compilation does not refer to the specific
constraint language adopted, its application is not limited to logic
programs/grammars: it can be applied to relational extensions of other
constraint languages, such as typed feature grammars, without further
adaptations.

Due to space limitations we discuss magic com- 
pilation by example only. The interested reader is 
referred to (Nilsson and Maluszynski, 1995) for an 
introduction. 

Example 2 We illustrate magic compilation of typed feature grammars
with respect to definite clause 1 in figure 3. Consider the TFL
definite clause in figure 4. As a result of magic compilation a magic
literal is added to the right-hand side of the original definite
clause. Intuitively understood, this magic literal "guards" the
application of the definite clause. The clause is applied only when
there exists a fact that unifies with this magic literal. 5 The
resulting definite clause is also referred to as the magic variant of
the original definite clause.

4 (Götz, 1995) proves that this compilation method is sound in the
general case and defines the large class of type constraints for which
it is complete.

Figure 1: Example of a typed feature grammar signature (part 1)

Figure 4: Magic variant of definite clause 1 in figure 3

The definite clause in figure 5 is the so-called seed which is used to
make the bindings as provided by the initial goal available for
bottom-up processing. In this case the seed corresponds to the initial
goal of parsing the string 'mary sleeps'. Intuitively understood, the
seed makes available the bindings of the initial goal to the magic
variants of the definite clauses defining a particular initial goal;
in this case the magic variant of the definite clause defining a
constituent of category 's'. Only when their magic literal unifies
with the seed are these clauses applied. 6

5 A fact can be a unit clause, i.e., a TFL definite clause without
right-hand side literals, from the grammar or derived using the rules
in the grammar. In the latter case one also speaks of a passive edge.

Figure 5: Seed corresponding to the initial goal of parsing the string
'mary sleeps'

The so-called magic rules in figure 6 are derived in 
order to be able to use the bindings provided by the 
seed to derive new facts that provide the bindings 
which allow for a goal-directed application of the 
definite clauses in the grammar not directly defining the initial
goal. Definite clause 3, for example, can be used to derive a
magic_append fact which percolates the relevant bindings of the
seed/initial goal to restrict the application of the magic variants of
definite clauses 4 and 5 in figure 3 (which are not displayed).
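The compilation step of Example 2 can be sketched in a few lines of
Python. This is only an illustration under simplifying assumptions:
literals are (predicate, arguments) pairs rather than typed feature
structures, the whole literal is carried over into its magic
counterpart, and the clause CLAUSE_1 as well as the names magic,
magic_compile and seed are hypothetical stand-ins, not part of the
grammar in figure 3 or of ConTroll.

```python
from typing import List, Tuple

Literal = Tuple[str, Tuple[str, ...]]   # (predicate name, argument names)
Clause = Tuple[Literal, List[Literal]]  # (left-hand side, right-hand side)


def magic(lit: Literal) -> Literal:
    """Magic literal of lit: same arguments, predicate prefixed with magic_."""
    pred, args = lit
    return ("magic_" + pred, args)


def magic_compile(clause: Clause) -> Tuple[Clause, List[Clause]]:
    """Return the magic variant of clause and one magic rule per body literal."""
    head, body = clause
    # The magic literal guards the application of the original clause.
    variant = (head, [magic(head)] + body)
    # Magic rule for the i-th literal: the guard plus all literals to its left.
    rules = [(magic(lit), [magic(head)] + body[:i]) for i, lit in enumerate(body)]
    return variant, rules


def seed(goal: Literal) -> Clause:
    """Unit clause that makes the bindings of the initial goal available."""
    return (magic(goal), [])


# Hypothetical stand-in for a phrase structure clause: s :- np, v, append.
CLAUSE_1: Clause = (
    ("s", ("Phon",)),
    [("np", ("Xs",)), ("v", ("Ys",)), ("append", ("Xs", "Ys", "Phon"))],
)
VARIANT, MAGIC_RULES = magic_compile(CLAUSE_1)
SEED = seed(("s", ("mary sleeps",)))
```

In this sketch the last magic rule derives a magic_append literal from
the guard and the np and v literals to its left, which mirrors the way
bindings are percolated from the seed to the append relation.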

2.3 Semi-naive Bottom-up Interpretation 

Magic-compiled logic programs/grammars can be interpreted in a
bottom-up fashion without losing any of the goal-directedness normally
associated with top-down interpretation using a so-called semi-naive
bottom-up interpreter: a dynamic interpreter that tables only complete
intermediate results, i.e., facts or passive edges, and uses an agenda
to avoid redundant sub-computations. The Prolog predicates in figure 7
implement a semi-naive bottom-up interpreter. 7 In this interpreter
both the table and the agenda are represented using lists. 8 The
agenda keeps track of the facts that have not yet been used to update
the table. It is important to notice that in order to use the
interpreter for typed feature grammars it has to be adapted to perform
graph unification. 9 We refrain from making the necessary adaptations
to the code for expository reasons.

6 The creation of the seed can be postponed until run time, such that
the grammar does not need to be compiled for every possible initial
goal.

Figure 6: Magic rules resulting from applying magic compilation to
definite clause 1 in figure 3

The table is initialized with the facts from the grammar. Facts are
combined using an operation called match. The match operation unifies
all but one of the right-hand side literals of a definite clause in
the grammar with facts in the table. The remaining right-hand side
literal is unified with a newly derived fact, i.e., a fact from the
agenda. By doing this, repeated derivation of facts from the same
earlier derived facts is avoided.

7 Definite clauses serving as data are encoded using the predicate
definite_clause/1: definite_clause((Lhs :- Rhs))., where Rhs is a
(possibly empty) list of literals.

8 There are various other, more efficient, ways to implement a dynamic
control strategy in Prolog. See, for example, (Shieber et al., 1995).

9 A term encoding of typed feature structures would enable the use of
term unification instead. See, for example, (Gerdemann, 1995).



% Succeeds if Goal can be derived bottom-up from the grammar.
semi_naive_interpret(Goal) :-
    initialization(Agenda, Table0),
    update_table(Agenda, Table0, Table),
    member(edge(Goal, []), Table).

% Process the agenda edge by edge until it is empty.
update_table([], Table, Table).
update_table([Edge|Agenda0], Table0, Table) :-
    update_table_w_edge(Edge, Edges, Table0, Table1),
    append(Edges, Agenda0, Agenda),
    update_table(Agenda, Table1, Table).

% Derive all edges that follow from Edge and the current table.
update_table_w_edge(Edge, Edges, Table0, Table) :-
    findall(NewEdge,
            match(Edge, NewEdge, Table0),
            Edges),
    store(Edges, Table0, Table).

% Add an edge to the table unless a more general edge is present.
store([], Table, Table).
store([Edge|Edges], Table0, Table) :-
    \+ ( member(GenEdge, Table0),
         subsumes_term(GenEdge, Edge) ),
    !,
    store(Edges, [Edge|Table0], Table).
store([_|Edges], Table0, Table) :-
    store(Edges, Table0, Table).

% Agenda and table both start out as the unit clauses of the grammar.
initialization(Edges, Edges) :-
    findall(edge(Head, []),
            definite_clause((Head :- [])),
            Edges).

% Unify one right-hand side literal with the agenda edge and the
% remaining right-hand side literals with edges in the table.
match(Edge, edge(Goal, []), Table) :-
    definite_clause((Goal :- Body)),
    Edge = edge(F, []),
    select(F, Body, R),
    edges(R, Table).

edges([], _).
edges([Lit|Lits], Table) :-
    member(edge(Lit, []), Table),
    edges(Lits, Table).

Figure 7: Semi-naive bottom-up interpreter
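The agenda-driven control regime can also be sketched in Python. The
sketch below is a simplification and not a port of the interpreter in
figure 7: it assumes that all facts are ground, that arguments are
constants or variables (strings starting with an upper-case letter),
and that every variable in a left-hand side also occurs in the
right-hand side, so that one-way matching and a set membership test
can stand in for unification and the subsumption check. The example
program (edge/path) is hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Atom = Tuple[str, ...]             # ("pred", arg1, arg2, ...)
Clause = Tuple[Atom, List[Atom]]   # (head, body); facts have an empty body
Subst = Dict[str, str]


def is_var(term: str) -> bool:
    """Assumption: variables are strings starting with an upper-case letter."""
    return term[:1].isupper()


def match(pattern: Atom, fact: Atom, subst: Subst) -> Optional[Subst]:
    """One-way match of a body literal against a ground fact."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    out = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if out.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return out


def solve(body: List[Atom], table: Set[Atom], subst: Subst) -> Iterator[Subst]:
    """Match all literals in body against facts in the table."""
    if not body:
        yield subst
        return
    for fact in table:
        s = match(body[0], fact, subst)
        if s is not None:
            yield from solve(body[1:], table, s)


def semi_naive(clauses: List[Clause]) -> Set[Atom]:
    """Return all facts derivable from clauses."""
    table = {head for head, body in clauses if not body}
    agenda = list(table)           # facts not yet used to update the table
    rules = [c for c in clauses if c[1]]
    while agenda:
        new_fact = agenda.pop()
        derived = set()
        for head, body in rules:
            for i, lit in enumerate(body):
                # One body literal must be matched by the agenda fact ...
                s0 = match(lit, new_fact, {})
                if s0 is None:
                    continue
                # ... and the remaining ones by facts in the table.
                rest = body[:i] + body[i + 1:]
                for s in solve(rest, table, s0):
                    derived.add(tuple([head[0]] + [s.get(t, t) for t in head[1:]]))
        for fact in derived - table:
            table.add(fact)
            agenda.append(fact)
    return table


PROGRAM: List[Clause] = [
    (("edge", "a", "b"), []),
    (("edge", "b", "c"), []),
    (("path", "X", "Y"), [("edge", "X", "Y")]),
    (("path", "X", "Z"), [("edge", "X", "Y"), ("path", "Y", "Z")]),
]
FACTS = semi_naive(PROGRAM)
```

Only facts that are not yet in the table are put on the agenda, so each
fact is used as the newly derived literal at most once.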



3 Selective Magic HPSG Parsing 

In case of large grammars the huge space require- 
ments of dynamic processing often nullify the ben- 
efit of tabling intermediate results. By combining 
control strategies and allowing the user to specify 
how to process particular constraints in the gram- 
mar the selective magic parser avoids this problem. 
This solution is based on the observation that there 
are sub-computations that are relatively cheap and 
as a result do not need tabling (Johnson and Dörre,
1995; van Noord, 1997).



3.1 Parse Type Specification 

Combining control strategies depends on a way to 
differentiate between types of constraints. For ex- 
ample, the ALE parser (Carpenter and Penn, 1994) 
presupposes a phrase structure backbone which can 
be used to determine whether a constraint is to be 
interpreted bottom-up or top-down. In the case of 
selective magic parsing we use so-called parse types 
which allow the user to specify how constraints in 
the grammar are to be interpreted. A literal (goal) 
is considered a parse type literal (goal) if it has as its 
single argument a typed feature structure of a type 
specified as a parse type. 10 

All types in the type hierarchy can be used as 
parse types. This way parse type specification sup- 
ports a flexible filtering component which allows us 
to experiment with the role of filtering. However, in 
the remainder we will concentrate on a specific class 
of parse types: We assume the specification of type 
sign and its sub-types as parse types. 11 This choice 
is based on the observation that the constraints on 
type sign and its sub-types play an important guid- 
ing role in the parsing process and are best inter- 
preted bottom-up given the lexical orientation of 
HPSG. The parsing process corresponding to such a parse type
specification is represented schematically in figure 8. Starting from
the lexical entries, i.e., the TFL definite clauses that specify the
word objects in the grammar, phrases are built bottom-up by matching
the parse type literals of the definite clauses in the grammar against
the edges in the table. The non-parse type literals are processed
according to the top-down control strategy described in section 3.3.

Figure 8: Schematic representation of the selective magic parsing
process

10 The notion of a parse type literal is closely related to that of a
memo literal as in (Johnson and Dörre, 1995).

11 When a type is specified as a parse type, all its sub-types are
considered as parse types as well. This is necessary as otherwise
there may exist magic variants of definite clauses defining a parse
type goal for which no magic facts can be derived, which means that
the magic literal of these clauses can be interpreted neither top-down
nor bottom-up.
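How a parse type specification extends to sub-types can be illustrated
with a small Python sketch. The type hierarchy, the names SUPERTYPES,
PARSE_TYPES, is_parse_type and split_body, and the representation of a
literal by the type of its single argument are assumptions made for
the purpose of illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical fragment of a type hierarchy: type -> immediate supertypes.
SUPERTYPES: Dict[str, List[str]] = {
    "sign": [],
    "word": ["sign"],
    "phrase": ["sign"],
    "list": [],
    "elist": ["list"],
    "nelist": ["list"],
}

# The user specifies sign as a parse type.
PARSE_TYPES: Set[str] = {"sign"}


def is_parse_type(type_name: str) -> bool:
    """A type is a parse type if it, or one of its supertypes, is specified."""
    if type_name in PARSE_TYPES:
        return True
    return any(is_parse_type(parent) for parent in SUPERTYPES.get(type_name, []))


def split_body(body: List[str]) -> Tuple[List[str], List[str]]:
    """Split right-hand side literals, given by the type of their argument,
    into parse type literals and non-parse type literals."""
    parse = [t for t in body if is_parse_type(t)]
    other = [t for t in body if not is_parse_type(t)]
    return parse, other
```

With this specification split_body(["word", "phrase", "nelist"]) puts
word and phrase among the parse type literals and leaves nelist to the
top-down interpreter.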

3.2 Selective Magic Compilation 

In order to process parse type goals according to a semi-naive magic
control strategy, we apply magic compilation selectively. Only the TFL
definite clauses in a typed feature grammar which define parse type
goals are subject to magic compilation. The compilation applied to
these clauses is identical to the magic compilation illustrated in
section 2.2 except that we derive magic rules only for the right-hand
side literals in a clause which are of a parse type. The definite
clauses in the grammar defining non-parse type goals are not compiled
as they will be processed using the top-down interpreter described in
the next section.
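The selective variant of the compilation can be sketched as follows in
Python. As before this is a hedged illustration: literals are
(predicate, arguments) pairs, the set PARSE_PREDICATES stands in for a
parse type specification, and the two-clause grammar and all function
names are hypothetical.

```python
from typing import Callable, List, Tuple

Literal = Tuple[str, Tuple[str, ...]]   # (predicate name, argument names)
Clause = Tuple[Literal, List[Literal]]  # (left-hand side, right-hand side)


def magic(lit: Literal) -> Literal:
    """Magic literal of lit: same arguments, predicate prefixed with magic_."""
    pred, args = lit
    return ("magic_" + pred, args)


def selective_magic_compile(
    clauses: List[Clause], is_parse_literal: Callable[[Literal], bool]
) -> List[Clause]:
    """Compile only clauses defining parse type goals; derive magic rules
    only for right-hand side literals that are of a parse type."""
    compiled: List[Clause] = []
    for head, body in clauses:
        if not is_parse_literal(head):
            # Left untouched: interpreted by the top-down interpreter.
            compiled.append((head, body))
            continue
        compiled.append((head, [magic(head)] + body))
        for i, lit in enumerate(body):
            if is_parse_literal(lit):
                compiled.append((magic(lit), [magic(head)] + body[:i]))
    return compiled


# Hypothetical parse type specification and grammar fragment.
PARSE_PREDICATES = {"s", "np", "v"}


def is_parse_literal(lit: Literal) -> bool:
    return lit[0] in PARSE_PREDICATES


GRAMMAR: List[Clause] = [
    (("s", ("Phon",)),
     [("np", ("Xs",)), ("v", ("Ys",)), ("append", ("Xs", "Ys", "Phon"))]),
    (("append", ("nil", "Ys", "Ys")), []),
]
COMPILED = selective_magic_compile(GRAMMAR, is_parse_literal)
```

In contrast to full magic compilation no magic_append rule is derived
here, and the clause defining append is passed through unchanged.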

3.3 Advanced Top-down Control 

Non-parse type goals are interpreted using the standard interpreter of
the ConTroll grammar development system (Götz and Meurers, 1997b) as
developed and implemented by Thilo Götz. This advanced top-down
interpreter uses a search function that allows the user to specify the
information on which the definite clauses in the grammar are indexed.
An important advantage of deep multiple indexing is that the linguist
does not have to take processing criteria into account when organizing
her/his data, as is the case with a standard Prolog search function,
which indexes on the functor of the first argument.

Another important feature of the top-down interpreter is its use of a
selection function that interprets deterministic goals, i.e., goals
which unify with the left-hand side literal of exactly one definite
clause in the grammar, prior to non-deterministic goals. This is often
referred to as incorporating deterministic closure (Dörre, 1993).
Deterministic closure reduces the number of choice points that need to
be set during processing to a minimum.
Furthermore, it leads to earlier failure detection.
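A selection function of this kind can be sketched in Python as
follows. The sketch assumes flat literals whose arguments are constants
or variables (strings starting with an upper-case letter) and assumes
that goals and clause heads do not share variable names; it is not the
selection function of the ConTroll interpreter, and the clause heads
used are hypothetical.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, ...]   # ("pred", arg1, arg2, ...)


def is_var(term: str) -> bool:
    """Assumption: variables are strings starting with an upper-case letter."""
    return term[:1].isupper()


def walk(term: str, subst: Dict[str, str]) -> str:
    """Follow variable bindings until an unbound variable or a constant."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unifiable(a: Atom, b: Atom) -> bool:
    """Unification test for flat atoms (no nested terms)."""
    if a[0] != b[0] or len(a) != len(b):
        return False
    subst: Dict[str, str] = {}
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        else:
            return False
    return True


def select_goal(goals: List[Atom], heads: List[Atom]) -> int:
    """Index of the first deterministic goal, else of the leftmost goal."""
    for i, goal in enumerate(goals):
        if sum(unifiable(goal, head) for head in heads) == 1:
            return i
    return 0


# Hypothetical left-hand sides of the definite clauses in a grammar.
HEADS: List[Atom] = [("cat", "np"), ("cat", "v"), ("agr", "third_sing")]
```

For the goals cat(X) and agr(A) the function selects agr(A), as it
unifies with exactly one left-hand side, so that no choice point needs
to be set for it.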

Finally, the top-down interpreter implements a powerful coroutining
mechanism: 12 At run time the processing of a goal is postponed in
case it is insufficiently instantiated. Whether or not a goal is
sufficiently instantiated is determined on the basis of so-called
delay patterns. 13 These are specifications provided by the user that
indicate which restricting information has to be available before a
goal is processed.

12 Coroutining appears under many different guises, for example,
suspension, residuation, (goal) freezing, and blocking. See also
(Colmerauer, 1982; Naish, 1986).

13 In the literature delay patterns are sometimes also referred to as
wait declarations or block statements.
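The effect of delay patterns can be illustrated with the following
Python sketch. The format of the patterns (a list of alternatives,
each a tuple of argument positions that must be instantiated) is an
assumption made for this illustration and does not reproduce the
syntax of delay patterns in ConTroll; the pattern for append is
likewise hypothetical.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, ...]   # ("pred", arg1, arg2, ...)

# Hypothetical delay patterns: append may be processed as soon as its
# first argument or its third argument is instantiated.
DELAY_PATTERNS: Dict[str, List[Tuple[int, ...]]] = {
    "append": [(0,), (2,)],
}


def is_var(term: str) -> bool:
    """Assumption: variables are strings starting with an upper-case letter."""
    return term[:1].isupper()


def is_instantiated(term: str, bindings: Dict[str, str]) -> bool:
    return not is_var(term) or term in bindings


def is_delayed(goal: Atom, bindings: Dict[str, str]) -> bool:
    """A goal is delayed if none of its delay patterns is satisfied."""
    patterns = DELAY_PATTERNS.get(goal[0])
    if patterns is None:
        return False               # no delay pattern: never delayed
    args = goal[1:]
    return not any(
        all(is_instantiated(args[i], bindings) for i in alternative)
        for alternative in patterns
    )


def partition(
    goals: List[Atom], bindings: Dict[str, str]
) -> Tuple[List[Atom], List[Atom]]:
    """Split goals into those that can be processed and those postponed."""
    ready = [g for g in goals if not is_delayed(g, bindings)]
    delayed = [g for g in goals if is_delayed(g, bindings)]
    return ready, delayed
```

A goal append(Xs, Ys, Zs) remains delayed as long as neither Xs nor Zs
is bound and becomes available for processing once one of them is.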

3.4 Adapted Semi-naive Bottom-up 
Interpretation 

The definite clauses resulting from selective magic transformation are
interpreted using a semi-naive bottom-up interpreter that is adapted
in two respects. It ensures that non-parse type goals are interpreted
using the advanced top-down interpreter, and it allows non-parse type
goals that remain delayed locally to be passed in and out of
sub-computations in a similar fashion as proposed by (Johnson and
Dörre, 1995). In order to accommodate these changes the adapted
semi-naive interpreter enables the use of edges which specify delayed
goals.

Figure 9 illustrates the adapted match operation. The first defining
clause of match/3 passes delayed and non-parse type goals of the
definite clause under consideration to the advanced top-down
interpreter via the call to advanced_td_interpret/2 as the list of
goals TopDown. 14 The second defining clause of match/3 is added to
ensure that all right-hand side literals are passed directly to the
advanced top-down interpreter if none of them is of a parse type.

match(Edge, edge(Goal, Delayed), Table) :-
    definite_clause((Goal :- Body)),
    select(Lit, Body, Lits),
    parse_type(Lit),
    Edge = edge(Lit, Delayed0),
    edges(Lits, Table, Delayed0, TopDown),
    advanced_td_interpret(TopDown, Delayed).

match(_Edge, edge(Goal, Delayed), _Table) :-
    definite_clause((Goal :- TopDown)),
    advanced_td_interpret(TopDown, Delayed).

Figure 9: Adapted definition of match/3

Allowing edges which specify delayed goals necessitates the adaptation
of the definition of edges/2. When a parse type literal is matched
against an edge in the table, the delayed goals specified by that edge
need to be passed to the top-down interpreter. Consider the definition
of the predicate edges in figure 11. The third argument of edges/4 is
used to collect delayed goals. When there are no more parse type
literals in the right-hand side of the definite clause under
consideration, the second defining clause of edges/4 appends the
collected delayed goals to the remaining non-parse type literals.
Subsequently, the resulting list of literals is passed up again for
advanced top-down interpretation.

14 The definition of match/3 assumes that there exists a strict
ordering of the right-hand side literals in the definite clauses in
the grammar, i.e., parse type literals always precede non-parse type
literals.

edges([Lit|Lits], Table, Delayed0, TopDown) :-
    parse_type(Lit),
    member(edge(Lit, Delayed1), Table),
    append(Delayed0, Delayed1, Delayed),
    edges(Lits, Table, Delayed, TopDown).
% No parse type literal is left: Lits holds the remaining
% non-parse type literals.
edges(Lits, _, Delayed, TopDown) :-
    \+ ( Lits = [Lit|_], parse_type(Lit) ),
    append(Delayed, Lits, TopDown).

Figure 11: Adapted definition of edges/4

4 Implementation 

The described parser was implemented as part of the ConTroll grammar
development system (Götz and Meurers, 1997b). Figure 10 shows the
overall setup of the ConTroll magic component. The ConTroll magic
component presupposes a parse type specification and a set of delay
patterns to determine when non-parse type constraints are to be
interpreted. At run time the goal-directedness of the selective magic
parser is further increased by using the phonology of the natural
language expression to be parsed, as specified by the initial goal, to
restrict the number of facts that are added to the table during
initialization. Only those facts in the grammar corresponding to
lexical entries that have a value for their phonology feature that
appears as part of the input string are used to initialize the table.
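The restriction of the initial table can be pictured with a short
Python sketch. The representation of lexical entries as dictionaries
with a single-word "phon" value, the toy lexicon and the function name
initial_facts are assumptions for the purpose of illustration and do
not reflect the data structures of the ConTroll magic component.

```python
from typing import Dict, List

LexicalEntry = Dict[str, str]


def initial_facts(lexicon: List[LexicalEntry], input_string: str) -> List[LexicalEntry]:
    """Keep only the lexical entries whose phonology occurs in the input."""
    words = set(input_string.split())
    return [entry for entry in lexicon if entry["phon"] in words]


# Hypothetical toy lexicon.
LEXICON: List[LexicalEntry] = [
    {"phon": "mary", "cat": "np"},
    {"phon": "sleeps", "cat": "v"},
    {"phon": "john", "cat": "np"},
]
TABLE0 = initial_facts(LEXICON, "mary sleeps")
```

For the input string 'mary sleeps' the entry for john is not added to
the table and can therefore not give rise to any edges.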

The ConTroll magic component was tested with a larger (> 5000 lines)
HPSG grammar of a sizeable fragment of German. This grammar provides
an analysis for simple and complex verb-second, verb-first and
verb-last sentences with scrambling in the mittelfeld, extraposition
phenomena, wh-movement and topicalization, integrated verb-first
parentheticals, and an interface to an illocution theory, as well as
the three kinds of infinitive constructions, nominal phrases, and
adverbials (Hinrichs et al., 1997).

As the test grammar combines sub-strings in a non-concatenative
fashion, a preprocessor is used that chunks the input string into
linearization domains. This way the standard ConTroll interpreter (as
described in section 3.3) achieves parsing times of around 1-5 seconds
for 5 word sentences and 10-60 seconds for 12 word sentences. 15 The
use of magic compilation on all grammar constraints, i.e., tabling of
all sub-computations, leads to a vast increase of parsing times. The
selective magic HPSG parser, however, exhibits a significant speedup
in many cases. For example, parsing with the module of the grammar
implementing the analysis of nominal phrases is up to nine times
faster. At the same time, though, selective magic HPSG parsing is
sometimes significantly slower. For example, parsing of particular
sentences exhibiting adverbial subordinate clauses and long extraction
is sometimes more than nine times slower. We conjecture that these
mixed results are due to the use of coroutining: as the test grammar
was implemented using the standard ConTroll interpreter, the delay
patterns used presuppose a data-flow corresponding to advanced
top-down control and are not fine-tuned with respect to the data-flow
corresponding to the selective magic parser.

15 Parsing with such a grammar is difficult in any system as it
neither has nor allows the extraction of a phrase structure backbone.

Figure 10: Setup of the ConTroll magic component

Coroutining is a flexible and powerful facility used in many grammar
development systems and it will probably remain indispensable in
dealing with many control problems despite its various
disadvantages. 16 The test results discussed above indicate that the
comparison of parsing strategies can be seriously hampered by
fine-tuning parsing using delay patterns. We believe therefore that
further research into the systematics underlying coroutining would be
desirable.

16 Coroutining has a significant run-time overhead caused by the
necessity to check the instantiation status of a literal/goal. In
addition, it demands the procedural annotation of an otherwise
declarative grammar. Finally, coroutining presupposes that a grammar
writer possesses substantial processing expertise.

5 Concluding Remarks

We described a selective magic parser for typed feature grammars
implementing HPSG that combines the advantages of dynamic bottom-up
and advanced top-down control. As a result the parser avoids the
efficiency problems resulting from the huge space requirements of
storing intermediate results in parsing with large grammars. The
parser allows the user to apply magic compilation to specific
constraints in a grammar, which as a result can be processed
dynamically in a bottom-up and goal-directed fashion.
State-of-the-art top-down processing techniques are used to deal with
the remaining constraints. We discussed various aspects concerning the
implementation of the parser, which was developed as part of the
grammar development system ConTroll.